
Invariantly Admissible Policy Iteration for a Class of Nonlinear Optimal Control Problems

Abstract

In this paper, we propose a generalized successive approximation method (SAM), called invariantly admissible policy iteration (PI), for iteratively solving a class of input-affine nonlinear optimal control problems. Unlike the existing SAM, the proposed method updates the domain of the next policy and value function for admissibility (and invariance). In the existing SAM, the admissibility of the generated policies is guaranteed under two implicit assumptions regarding Lyapunov's theorem and invariance, both of which are presented and discussed in this paper and are generally not true. In contrast, the proposed invariantly admissible PI guarantees admissibility in a more refined manner, without such assumptions. The admissibility and invariance of the updated region, with respect to the corresponding policies, are mathematically proven under the specific invariant admissible update rule. We also provide monotonic decrease and uniform convergence properties of the sequence of value functions under certain conditions. Finally, numerical simulations are presented to illustrate the proposed PI method and its effectiveness.
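For orientation, the classical successive approximation that the abstract generalizes can be sketched on the simplest input-affine case, a scalar linear system with quadratic cost. This is not the paper's invariantly admissible PI: it has no domain update, because in the linear-quadratic case a stabilizing gain is admissible on the whole state space. The function names, the parameters (a, b, q, r, k0), and the scalar setting are all assumptions chosen for illustration.

```python
import math


def policy_iteration_scalar_lq(a, b, q, r, k0, iters=20):
    """Classical policy iteration (illustrative, not the paper's method).

    Assumed setting: dx/dt = a*x + b*u, cost = integral of (q*x^2 + r*u^2) dt,
    policy u = -k*x, value function V(x) = p*x^2, with b != 0, q > 0, r > 0.
    """
    # Admissibility in this scalar case: the closed loop a - b*k must be stable.
    if a - b * k0 >= 0:
        raise ValueError("initial gain k0 is not stabilizing (not admissible)")
    k = k0
    values = []
    for _ in range(iters):
        # Policy evaluation: solve 2*(a - b*k)*p + q + r*k^2 = 0 for p.
        p = (q + r * k**2) / (2.0 * (b * k - a))
        values.append(p)
        # Policy improvement: minimize the Hamiltonian with respect to u.
        k = b * p / r
    return k, values


def riccati_solution(a, b, q, r):
    """Positive root of the scalar Riccati equation 2*a*p - (b^2/r)*p^2 + q = 0."""
    return r * (a + math.sqrt(a**2 + b**2 * q / r)) / b**2


if __name__ == "__main__":
    k, values = policy_iteration_scalar_lq(1.0, 1.0, 1.0, 1.0, k0=3.0)
    print(values[:3], riccati_solution(1.0, 1.0, 1.0, 1.0))
```

In this sketch the sequence of value coefficients p decreases monotonically toward the Riccati solution, which mirrors the monotonicity and convergence properties the abstract mentions. For genuinely nonlinear dynamics the value function is typically only valid on a subset of the state space, which is the situation the paper's domain update is said to address.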
